Chef Delivery on a laptop

This post describes how you can run Chef Delivery on a laptop, using Vagrant. My main intent is to give you a way to work through Chef’s Delivery tutorial if you do not have access to AWS – or if you are really lost using a Windows workstation. Use the AWS CloudFormation template in the Tutorial if you can – the approach in this post is more error-prone. You have been warned.

I’m going to assume you are reasonably familiar with Linux, Vagrant and using the knife command – and that you already have Vagrant and Virtualbox installed and working.

We’re going to spin up multiple (at least 4) virtual machines. I’m using an 8-core Ubuntu laptop with 16GB memory. If you are running on something much smaller, good luck.

Plan of Attack

The first part (the bulk) of this post, Setting up Chef Delivery, replaces the Install Chef Delivery on AWS section of the tutorial. In it, we’ll use the delivery-cluster cookbook to provision a Chef Server, Delivery Server and build node as virtual machines using Vagrant.

The second part, Configuring the Workstation, sets up a workspace to use with Chef Delivery. If refers to and partly replaces the second and third sections of the Tutorial.

The final part of this post, Following the Chef Delivery Tutorial gives you some pointers on working through the remaining sections of the Chef Delivery tutorial (the ‘meat’ of the tutorial) using this setup. In particular, the first time you send the demo application through the pipeline, it won’t quite work, due to an interesting ‘chicken and egg’ problem. I’ll explain why and what to do about it when we get there.

Setting up Chef Delivery

Prerequisites

  • Vagrant
  • Virtualbox
  • ChefDK 0.15.5 or more recent (includes Delivery CLI)
  • Git
  • build-essential (Ubuntu) or comparable development library

If you already have ChefDK installed, please check its version and upgrade if needed:

chef --version
 Chef Development Kit Version: 0.15.15
 chef-client version: 12.11.18
 delivery version: 0.0.23 (bf89a6b776b55b89a46bbd57fcaa615c143a09a0)
 berks version: 4.3.5
 kitchen version: 1.10.0

If it’s an earlier version, you may not have the Delivery CLI and you may encounter errors when using delivery-cluster.

You need the development environment library appropriate to your OS, e.g. for Ubuntu:

sudo apt-get install build-essential

(see Chef Delivery docs for other platforms)

Obtain Chef Delivery license

If you do not have a license already, you can obtain a temporary license that will let you use Chef Delivery for the tutorial.

Copy the license into the home directory on your workstation.

cp delivery.license ~

The delivery-cluster cookbook will handle putting the license onto the Delivery Server, once created.

Prepare to provision using delivery-cluster

Clone the delivery-cluster repo from Github:

git clone https://github.com/opscode-cookbooks/delivery-cluster.git ~/delivery-cluster

Run the following from within that repo to generate the provisioning settings to be used:

cd ~/delivery-cluster
rake setup:generate_env

Accept the default of test for Environment Name and Cluster ID.  Change the Driver Name to vagrant:

Global Attributes
Environment Name [test]: 
Cluster ID [test]:

Available Drivers: [ aws | ssh | vagrant ]
Driver Name [aws]: vagrant

Accept the default SSH Username and optionally change the Box Type and Box URL to use Ubuntu (which is what I am using). The default Centos selection should also work.

Driver Information [vagrant]
SSH Username [vagrant]: 
Box Type:  [opscode-centos-6.6]: opscode-ubuntu-14.04
Box URL:  [....]: https://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_ubuntu-14.04_chef-provisionerless.box

Here’s the Ubuntu Box URL for easier copy/paste:

https://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_ubuntu-14.04_chef-provisionerless.box

Take the defaults for all other settings. You can see all of the expected settings in the output JSON shown in the next Section.

Be aware this setup is slightly different from the Tutorial – specifically:

  • the IP addresses used for the Chef Server, Delivery Server and build node(s) are ‘33.33.33.xx’ rather than ‘10.0.0.xx’
  • the Delivery Enterprise name is ‘test’ rather than ‘delivery-demo’

The rake command will generate the environment in ~/delivery-cluster/environments/test.json. You can rerun the command or edit the file directly if you need to.

Update the environment to include FQDN

We need to make a manual update to the Delivery environment file to work around an issue  with provisioning Delivery to vagrant.  Without this workaround, various configuration files will incorrectly use an IP address of 10.0.2.15 when trying to communicate with the Chef Server or Delivery Server. To avoid this, we need to specify the FQDN to be used for these servers.

Edit the file ~/delivery-cluster/environments/test.json so that the FQDN is specified:

{
  "name": "test",
  "description": "Delivery Cluster Environment",
  "json_class": "Chef::Environment",
  "chef_type": "environment",
  "override_attributes": {
    "delivery-cluster": {
      "accept_license": true,
      "id": "test",
      "driver": "vagrant",
      "vagrant": {
        "ssh_username": "vagrant",
        "vm_box": "opscode-ubuntu-14.04",
        "image_url": "https://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_ubuntu-14.04_chef-provisionerless.box",
        "key_file": "/home/test/.vagrant.d/insecure_private_key"
      },
      "chef-server": {
        "fqdn": "33.33.33.10",
        "organization": "test",
        "existing": false,
        "vm_hostname": "chef.example.com",
        "network": ":private_network, {:ip => '33.33.33.10'}",


        "vm_memory": "2048",
        "vm_cpus": "2"
      },
      "delivery": {
        "fqdn": "33.33.33.11",
        "version": "latest",
        "enterprise": "test",
        "license_file": "/home/test/delivery.license",
        "vm_hostname": "delivery.example.com",
        "network": ":private_network, {:ip => '33.33.33.11'}",
        "vm_memory": "2048",
        "vm_cpus": "2",
        "disaster_recovery": {
          "enable": false
        }
      },
      "builders": {
        "count": "1",
        "1": {
         "fqdn": "33.33.33.14",
         "network": ":private_network, {:ip => '33.33.33.14'}",
          "vm_memory": "2048",
          "vm_cpus": "2"
        }
      }
    }
  }
}

Note: It may not be necessary to specify the FQDN for the build node.

Provision the Servers

To start provisioning run:

export CHEF_ENV=test
rake setup:cluster

Watch it for a few minutes, to make sure there’s no early failure. You should see it create a Vagrant machine for the chef server and start to run recipes to install and configure the server.

If it’s going OK, go for coffee. Actually, go for lunch. This will take a while. There’s a lot going on… not only is it installing the Chef Server, Delivery Server and Build node, but also setting up credentials and certificates.

Sometimes a download will timeout and the provisioning run will fail part way through. If this happens, try rerunning it.

Hopefully you will come back and find a successfully completed Chef run. The last node provisioned should have been the build node.

Now we need to make sure it actually worked.  Let’s start by getting the information that will let us logon to the Servers. Run the following rake command:

rake info:delivery_creds
Username: delivery
Password: XSomeGeneratedPassword=
Chef Server URL: https://33.33.33.10/organizations/test

Delivery Server
Created enterprise: test
Admin username: admin
Admin password: +cAnotherGeneratedPasswordg=
Builder Password: UtAndAnotherGeneratedPassword4=
Web login: https://33.33.33.11/e/test/

You should be able to logon to both the Chef Server and the Delivery Server using the URLs and credentials provided by the rake command. You will need to confirm a security exception with the browser, as we’re using self-signed certificates.

If you’re using Firefox and have previously installed Delivery, you may need to clear the old certificates from the browser first. Firefox will give you Error code: SEC_ERROR_REUSED_ISSUER_AND_SERIAL. Go to ‘Preferences > Advanced > Certificates > View Certificates’ in the Firefox menu and delete the entries for 33.33.33.10 from the ‘Server’ and ‘Authorities’ tabs.

For a final smoke test, run the following knife command from within the delivery-cluster repo:

knife node status
build-node-test-1    available

Chef Delivery uses push-jobs, and the above command lists nodes that are visible to push-jobs. If you do not see your build node(s) when you run the command, something has  gone wrong (Chef server certificate problem, incorrect IP address, ….). Double check your environment file and try rerunning the rake command.

If you do make a significant change to the environment settings (e.g. changing the box type), I recommend destroying the virtual machines (see next section) and starting with a fresh clone of delivery-cluster, to remove cached provisioning information.

Managing the virtual machines

The delivery-cluster cookbook creates Vagrant specifications for the virtual machines in ~/delivery-cluster/.chef/vms. From there, you can ssh, halt and start up the VMs, e.g.:

cd ~/delivery-cluster/.chef/vms

ls
build-node-test-1.vm  delivery-server-test.vm      Vagrantfile
chef-server-test.vm   

vagrant halt
...
vagrant up
...
vagrant ssh build-node-test-1
...

Configuring the Workstation

In this section, we’re going to follow the Tutorial section 2 to create a Delivery organization and user.  But we’re going to short-cut a step by first generating an SSH key for the user. This only makes sense because we are both the user and Administrator of Chef Delivery on the workstation: the Tutorial splits the actions because that is more representative of normal use.

Generate an SSH Key for the Chef Delivery User

Generate an ssh key to use in your Chef Delivery user account:

ssh-keygen -t rsa -b 4096 -C "test@example.com"

I recommend saving it to /home/<user>/.ssh/delivery_rsa rather than the default id_rsa file. Do not enter a passphrase (press <Enter> twice when prompted).

Create or append the following in your ~/.ssh/config file, to make sure that the above key is used when communicating with the Delivery git server:

Host 33.33.33.11
        IdentityFile /home/test/.ssh/delivery_rsa
        User test
        IdentitiesOnly yes

test is the name of the user we are going to create in Chef Delivery.

Create an Organization and User

Logon to the Delivery UI at:

https://33.33.33.11/e/test/

using the admin user and password from the rake info:delivery-creds command.

Follow  Tutorial Section 2 to create the ‘delivery-demo’ organization and ‘test’ user. Set the user id to ‘test’ rather than ‘jsmith’. Specify the SSH public key at the same time as creating the user.

Your public key is here:

cat ~/.ssh/delivery_rsa.pub

Once the user is created, verify the setup and create a ‘known hosts’ entry by authenticating to Delivery git server.

ssh -l test@test -p 8989 33.33.33.11

The authenticity of host '[33.33.33.11]:8989 ([33.33.33.11]:8989)' can't be established.
RSA key fingerprint is 11:ce:26:01:b3:ee:f7:7f:4c:e5:ea:a5:91:a6:0d:6a.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '[33.33.33.11]:8989' (RSA) to the list of known hosts.
Hi test@test! You've successfully authenticated, but Chef Delivery does not provide shell access.
                 Connection to 33.33.33.11 closed.

If you get ‘Permission denied’, check that you have set the correct public key and user name in Chef Delivery, and the correct key file and user name in the ssh config file. Also check that the private key (~/.ssh/delivery_rsa) is only readable by the user (e.g. mode ‘0600’).

If you’ve previously installed Delivery,  you may get a warning because of a previous ‘known hosts’ entry for 33.33.33.11:

@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!
Someone could be eavesdropping on you right now (man-in-the-middle attack)!
It is also possible that a host key has just been changed.
The fingerprint for the RSA key sent by the remote host is
11:ce:26:01:b3:ee:f7:7f:4c:e5:ea:a5:91:a6:0d:6a.
Please contact your system administrator.
Add correct host key in /home/test/.ssh/known_hosts to get rid of this message.
Offending RSA key in /home/test/.ssh/known_hosts:8
  remove with: ssh-keygen -f "/home/test/.ssh/known_hosts" -R [33.33.33.11]:8989
RSA host key for [33.33.33.11]:8989 has changed and you have requested strict checking.
Host key verification failed.

It is OK to remove the ‘known hosts’ entry using the ssh-keygen command given in the message, because you know you have created a new VM using the same address.

Configure Workstation to use Chef Delivery Server

We’re now going to setup a workspace for Delivery projects:

mkdir -p ~/delivery-demo
cd ~/delivery-demo
delivery setup --ent=test --org=delivery-demo --user=test --server=33.33.33.11

The delivery setup command creates a .delivery/cli.toml file which is used by the delivery CLI whenever it is run in ~/delivery-demo or any subdirectory.

cat ~/delivery-demo/.delivery/cli.toml
 api_protocol = "https"
 enterprise = "test"
 git_port = "8989"
 organization = "delivery-demo"
 pipeline = "master"
 server = "33.33.33.11"
 user = "test"

Create Acceptance Test Node

The last thing we need to do to configure the workstation is create the test node(s) for the demo application. The Tutorial only requires a single test node, in the Acceptance environment. Normally, you would need one or more nodes in each of the Acceptance, Union, Rehearsal and Delivered environments.

Create a Vagrantfile in ~/delivery-demo/Vagrantfile and copy the following into it:

Vagrant.configure('2') do |outer_config|
  outer_config.vm.define "acceptance-test-1" do |config|
    config.vm.network(:private_network, {:ip => '10.0.0.15'})
    config.vm.box = "opscode-ubuntu-14.04"
    config.vm.box_url = "https://opscode-vm-bento.s3.amazonaws.com/vagrant/virtualbox/opscode_ubuntu-14.04_chef-provisionerless.box"
    config.vm.hostname = "acceptance-test-delivery-demo-1"
  end
end

Now use it to start a new Ubuntu VM:

cd ~/delivery-demo
vagrant up

We need to bootstrap this node and register it with the Chef server. The node needs to be in the acceptance environment for the delivery-demo project, which will be named ‘acceptance-test-delivery-demo-awesome_customers_delivery-master’ (acceptance-<enterprise>-<organization>-<project>-<pipeline>) .

cd ~/delivery-cluster
knife environment create acceptance-test-delivery-demo-awesome_customers_delivery-master
knife bootstrap 10.0.0.15 --node-name awesome_customers_delivery-acceptance \
  --environment acceptance-test-delivery-demo-awesome_customers_delivery-master \
  --run-list "recipe[apt],recipe[delivery-base]" -xvagrant -Pvagrant --sudo
knife node run_list set awesome_customers_delivery-acceptance \
  "recipe[apt],recipe[delivery-base],recipe[awesome_customers_delivery]"

We bootstrap using the ‘delivery-base’ recipe, as this will install Chef push-jobs. We then set its runlist to include the ‘awesome_customers_delivery’ cookbook, as that is what we are testing. Note we cannot bootstrap with this recipe because it is not loaded into the Chef Server yet.

Following the Chef Delivery Tutorial

You should now be able to follow the Chef Delivery Tutorial, starting at the fourth section to Create a project.  Go through to the first step of  Step 8 (Deliver the Change) of that Section. In that step, you will watch your project go through the Acceptance Phase and then try to navigate to the application in acceptance test at http://10.0.0.15. It won’t be there.

What went wrong? First, I recommend reading the ‘Learn more about the deployment process’ foldout in Step 8. Then in the Delivery UI, look carefully at the Deploy stage. At the end you will see something like:

Recipe: delivery-truck::deploy
  * delivery_push_job[deploy_awesome_customers_delivery] action dispatch (up to date)

Running handlers:
Running handlers complete
Chef Client finished, 0/1 resources updated in 02 seconds

Basically, the push job that should have run chef-client on the acceptance test node did nothing. Why?

The delivery-truck::deploy recipe searches for nodes in the correct environment, with push-jobs and the project cookbook in their list of recipes.  Specifically, the search term used is "recipes:#{cookbook.name}*". This search term will match against the last-run recipes on the node, NOT against the recipes in the current run-list. When we bootstrapped the acceptance test node, we did not include the awesome_customers_delivery cookbook because it had not been uploaded to the Chef Server yet. The application cookbook was only uploaded in the Publish stage of the Verify phase. This is the ‘chicken-and-egg’ situation I referred to earlier.

To get round this, we will run a one-off manual push-job to converge the node:

cd ~/delivery-cluster
knife job start chef-client awesome_customers_delivery-acceptance

You should now be able to navigate to the application at http://10.0.0.15. Future runs through the pipeline will automatically run the push-job and what you will see in the Acceptance Deploy phase is:

Converging 1 resources
Recipe: delivery-truck::deploy
  * delivery_push_job[deploy_awesome_customers_delivery] action dispatch
    - Dispatch push jobs for chef-client on awesome_customers_delivery-acceptance

Running handlers:
Running handlers complete

Continue with  Step 8 where you click ‘Deliver’ in the Acceptance Stage of the pipeline. You should now be able to finish the remaining sections in the Tutorial.

 

 

 

 

Avoiding the possible pitfalls of derived attributes

What this post is about

This post is intended for folks who are comfortable with the basics of attributes in Chef, but want to understand some of the subtleties better. It focusses on one specific aspect – derived or computed attributes – and how to make sure they end up with the value you intend. I’m going to cover four topics:

This post is with huge thanks to the many people at the 2014 Chef Summit and online who helped me with this topic, including but not limited to Noah Kantrowski, Julian Dunn and Lamont Granquist. The good ideas are theirs, any mistakes are mine.

Attribute precedence in practice

First: it’s not all bad news. Much of the time the attribute precedence scheme in Chef 11, although complex, will do what you want it to. The complexity is there because Chef supports multiple different approaches to customizing attribute values, particularly (1) using wrapper cookbooks versus (2) using roles and environments. You can see some of these tradeoffs in this description of Chef 11 attribute changes.

Here’s a reminder of the attribute precedence scheme. The highest numbers indicate the highest precedence:


Image linked from Chef documentation.

One benefit of the above scheme is that you can override default attributes with new values in a wrapper cookbook at default level. You do not need to use a higher priority level. This is important because you can wrapper the wrapper if you have to, without suffering “attribute priority inflation”. Why wrapper a wrapper? It can be very useful when you need multiple levels of specialization, e.g. to set defaults for an organization; override some of those defaults for a business system, and then do further customizations for a specific deployment of that business system.

A benefit of the precedence scheme when working with roles and environments is that you can set default attributes in a role or environment, and they will override the default attributes in cookbooks. The mental model is that your cookbooks are designed to be generally reusable, and have the least context-awareness. An environment provides additional context such as implementing the policies specific to your organization. Similarly, roles can configure the recipes to meet a specific functional purpose.

The possible pitfalls of derived attributes

So where can it go wrong? Let’s use a simple example, consisting of a base cookbook called “app”, and a wrapper cookbook called “my_app” which has a dependency on “app” in its metadata.rb. The contents of those cookbooks are:

cookbooks/app/attributes/default.rb:

  default["app"]["name"] = "app"
  default["app"]["install_dir"] = "/var/#{node["app"]["name"]}"

-------------------------------------------------------------------------------
cookbooks/app/recipes/default.rb:

ruby_block "Executing resource in recipe" do
  block do
    Log.info "Executing recipe, app name is: #{node['app']['name']};" +
      " install_dir is #{node['app']['install_dir']}"
  end
end

------------------------------------------------------------------------------
cookbooks/my_app/attributes/default.rb:

  default["app"]["name"] = "my_app"

------------------------------------------------------------------------------
cookbooks/my_app/recipes/default.rb:

  include_recipe "app::default"

And they are uploaded to the server using:

knife cookbook upload app my_app

The base “app” cookbook has an application “name” attribute which defaults to “app”, and an “install_dir” attribute which is calcyulated from the application name. For simplicity, the recipe which would actually deploy “app” just prints out the value of the attributes using a ruby block so that we see the values that would be used when the resources are run. The wrapper “my_app” cookbook changes the application name attribute from “app” to “my_app”.

What happens if we run the wrapper cookbook?

sudo chef-client -r 'recipe[my_app]' -linfo

run1

The “name” attribute is set to “my_app”, however the derived “install_dir” attribute still has its old value of “/var/app”, which is probably not what was intended.  This is not a question of priority: if the wrapper contained override["app"]["name"] = "my_app", we would get the same result. To understand why this happens, we need to look at the order of evaluation of the attributes.

What happens during this chef-client run in this example is as follows:

  1. As there are no roles or environments, the “compile” phase starts by evaluating attribute files based on the cookbooks in the runlist and their dependencies.
  2. The first cookbook in the list is “my_app”, which has a dependency on “app”. Dependencies are loaded first, so the default “name” attribute is set to “app” and the default “install_dir” attribute is set to “/var/app”.
  3. The “my_app” wrapper attribute file is loaded second and updates the default “name” attribute to “my_app”. The “install_dir” attribute is not updated and therefore keeps its value of “/var/app”.
  4. After that, the recipe files are loaded and the ruby_block resource is added to the resource collection, instantiated with the current values of the “name” and “install_dir” attributes.
  5. The “converge” phase executes the resources in the resource collection, printing out the values “my_app” and “/var/app”.

How attribute values are determined

Basic model

The following diagrams may help explain how attribute values are evaluated.

First, let’s work with a runlist like the following, consisting of three recipes in three cookbooks (cb1, wcb, cb3). The second recipe(wcb::rc2) is a wrapper of a recipe in a fourth cookbook (cb2::r2). Each cookbook has a single attribute file (a_cb1, etc).

runlist

The diagram below illustrates how the values of the attributes in this example change through the run. The attribute files are evaluated in runlist order but with dependencies (from metadata.rb) evaluated first. In this case, ‘a_cb2’ is evaluated after ‘a_cb1’ but before ‘a_wcb’. As the attribute files are evaluated, attribute values are put into “buckets” based on the attribute name and priority, e.g. node.default['x'] updates the value of ‘x’ in the default bucket for ‘x’. Each subsequent update to the same attribute and priority replaces the value in that bucket.

compile-converge

When the recipes are run and they access an attribute e.g. node['x'], the value that is passed back is that of the highest priority bucket that has a value in it.

Here’s an example showing the problem with a derived attribute. In the first step when “y” is calculated, the value of “x” is “app” and so “y” is set to “/var/app”. The value of “x” is set to “my_app” in the second step. When recipe r2 retrieves the attribute values in the third step, it therefore gets “my_app” and “/var/app”:

eval1

This diagram shows why using a higher priority does not solve the problem. Again, “y” is calculated in the first step and “x” is not set to “my_app” until the second step:

eval2

There is a wrinkle that you should be aware of. If you chose to set “normal” precedence rather than “override” in the above, the first run would still give the same result as above, but subsequent runs would “work”. Normal attributes are special because they persist across chef-client runs. If the wrapper cookbook contained normal['x']="my_app", “y” would still be computed as “/var/app” on the first run. On the second run, however, it would change to “/var/my_app”, because “my_app” would be in the “normal” bucket at the start of evaluation and would be used in the first step to calculate “y”.

Model including roles

Roles introduce two changes to our model:

  1. Role attributes have a higher precedence than those in cookbooks, effectively creating two new rows of buckets labelled as “role_default” and “role_override” in the diagram below
  2. Role attributes are always evaluated before cookbook attributes, regardless of their runlist position

eval3

These precedence rules mean that you can use roles to avoid the derived attribute problem, as shown below:

eval3

Setting the default value of “x” to “my_app” in the role guarantees that the value of “my_app” will be present when “y” is evaluated in cookbook cb2. “my_app” will be used rather than “app” because a role default value takes precedence over the cookbook default (it is in a higher priority bucket).

Model including environments

Environments add two new precedence levels, one between default and role_default; one after role_override and before “automatic”. Like roles, they are always evaluated before cookbook attributes.

Some ways to solve the problem

As a user of a cookbook with a derived attribute

As a user of a cookbook with a derived attribute and you do not have the option of modifying the base cookbook, you have two basic choices:

  • Always set any computed attributes if you change the attributes that they are derived from
  • Use a role or environment

Set computed attributes

The simplest approach is to make sure you set all of the attributes that are derived from attributes that you want to change. In our original example, we would specify both “name” and “install_dir”, e.g.:

my_app/attributes/default.rb:

  default["app"]["name"] = "my_app"
  default["app"]["install_dir"] = "/var/my_app"

This is probably the approach you will want to take if you use the wrapper cookbook approach.

Use a role or environment

As explained in the Model including roles section, attributes in roles have priority over attributes in cookbooks, and are also always evaluated before them. If you use roles, then setting an attribute in a role will also change any computed attributes. In our original example, we could define myapp role as:

roles/myapp.json:
{
  "name": "myapp",
  "default_attributes": {
    "app": {
      "name": "my_app"
    }
  }
}

knife role from file roles/myapp.json

Then run with a modified runlist:

sudo chef-client -r 'recipe[app]','role[myapp]' -linfo

The result would be to set both the “name” attribute to “my_app”, and the “install_dir” attribute to “/var/my_app”.

As an author of a cookbook with a derived attribute

As an author of a cookbook, you may prefer not to rely on users noticing derived attributes and handling them appropriately. Here are some possibilities to make life easier for your users:

  • Use a variable and not an attribute
  • Use delayed evaluation in the recipe
  • Use conditional assignment in the recipe

A gist for this example.

Use a variable and not an attribute

If the derived value should always be calculated, then don’t use an attribute, use a ruby variable in the recipe. In our original example, if “install_dir” should always be “/var” followed by the application name, remove the derived attribute and instead do the following in the recipe:

app/attributes/default.rb:

  default["app"]["name"] = "app"

-------------------------------------------------------------------------------
app/recipes/default.rb:

install_dir = "/var/#{node["app"]["name"]}"
ruby_block "Executing resource in recipe" do
  block do
    Log.info "Executing recipe, application name is: #{node['app']['name']};" +
      " install_dir is #{install_dir}"
  end
end

Similarly, if the user needs to be able to change the root path for the install directory but the application should always be installed in a directory with the application name, create two attributes for “root_path” and “name”, and combine them using a variable:

app/attributes/default.rb:

  default["app"]["name"] = "app"
  default["app"]["root_path"] = "/var"

-------------------------------------------------------------------------------
app/recipes/default.rb:

install_dir = "#{node["app"]["root_path"]}/#{node["app"]["name"]}"
ruby_block "Executing resource in recipe" do
  block do
    Log.info "Executing recipe, application name is: #{node['app']['name']};" +
      " install_dir is #{install_dir}"
  end
end

Use delayed evaluation in the recipe

Noah Kantrowitz proposed an approach for delaying evaluation of the derived attribute into the recipe, whilst still allowing it to be defined and overridden in the attribute file.

This approach sets up a template for the derived attribute in the attribute file, using the ruby %{} operator to define a placeholder. It then uses the ruby % operator in the recipe file to perform string interpolation, i.e. to substitute the actual value of the placeholder. In our original example, this would look like:

app/attributes/default.rb:

  default["app"]["name"] = "app"
  default["app"]["install_dir"] = "/var/%{name}"

-------------------------------------------------------------------------------
app/recipes/default.rb:

install_dir = node["app"]["install_dir"] % { name: node["app"]["name"]}
ruby_block "Executing resource in recipe" do
  block do
    Log.info "Executing recipe, application name is: #{node['app']['name']};" +
      " install_dir is #{install_dir}"
  end
end

node["app"]["install_dir"] % { name: node["app"]["name"]} causes Ruby to substitute the value of the “name” attribute wherever the placeholder “%{name}” appears in the “install_dir” attribute. Because this substitution is delayed until the recipe is evaluated, the “name” attribute has already been set by the wrapper cookbook, and “install_dir” will be set to “/var/my_app”.

One consequence of this approach is that the “install_dir” attribute will have a value of “/var/%{name}” in the node object at the end of the run. This may not be desirable if “install_dir” was something you used in node searches. It also means that any cookbooks that reference the “install_dir” attribute need to perform the placeholder substitution before using it.

Use conditional assignment in the recipe

This approach is based on something suggested by Lamont Granquist. It uses conditional logic in the recipe that will only set the default value if no other value has been provided in a wrapper cookbook. Our example would look like this:

app/attributes/default.rb:

  default["app"]["name"] = "app"
  # default["app"]["install_dir"] = "/var/#{node['app']['name']" 

-------------------------------------------------------------------------------
app/recipes/default.rb:

install_dir = node["app"]["install_dir"] || "/var/#{node['app']['name']"
ruby_block "Executing resource in recipe" do
  block do
    Log.info "Executing recipe, application name is: #{node['app']['name']};" +
      " install_dir is #{install_dir}"
  end
end

The line for “install_dir” in the attribute file is commented out, so that it does not take effect but a user can see that the attribute exists and can be overridden. The line install_dir = node["app"]["install_dir"] || "/var/#{node['app']['name']" will take any overridden value of the node attribute, but otherwise will set it based on the “name” attribute. The conditional assignment is important because otherwise it would overwrite an assignment done in the wrapper cookbook.

With this code, the “install_dir” attribute saved in the node object will be null unless it has been overridden. If you want the actual value used to be saved, you may want to conditionally set the node attribute rather than a variable, e.g. node.default["app"]["install_dir"] = "/var/#{node['app']['name']" unless node["app"]["install_dir"].

Getting started with Chef report and exception handlers

This post is for people who want to use or write their first Chef report handler, but aren’t sure where to begin. I first attempted to write a handler just after I learned how to write basic Chef recipes. It was hard, because I only had a rudimentary understanding of Chef mechanics, and I was learning Ruby as I went. This article would have got me started faster. However, in the end you are going to be writing Ruby, and probably you’ll need to get deeper into Chef too.

The three types of handler

There are three types of handler:

  • Start handler – Runs at the beginning of a chef-client run
  • Report handler – Runs at the end of a chef-client run, after all of the recipes have successfully completed
  • Exception handler – Runs at the end of a chef-client run if it exits with an error code

I am going to gloss over start handlers: they are less common and somewhat more complex because you have to get the handler in place before the chef-client run happens (so you can’t distribute them in a recipe).

Report handlers are useful for gathering information about a chef-client run (e.g. what cookbook versions were run, what resources were updated). Exception handlers are useful to capture information about failed runs, or to perform cleanup on exception (e.g. cleaning up frozen filesystem resources).

Using the built-in handlers

The most basic place to start is being able to run one of the built-in handlers. At current time of writing, there are two handlers built in to the chef-client: the json_file handler, and the error_report handler. The benefit of starting here is you don’t have to worry about how to get the handler code onto the nodes that you want to run them on – they’re distributed with the chef-client.

Running the error_report handler

Let’s start with the error_report handler. To run this, all you need to do is use the ‘chef_handler’ cookbook and add the following into a recipe in your runlist:

include_recipe "chef_handler"

chef_handler "Chef::Handler::ErrorReport" do
  source "chef/handler/error_report"
  action :enable
end

After a chef-client run, this will create a file called ‘failed_run_data.json’ in the chef-client cache (typically ‘/var/chef/cache’) on the node it is running on.

Despite its name, this handler can be useful whether or not the run fails. Assuming your run succeeded, here’s what you’ll find in the ‘failed_run_data.json’ file.

{
"node": {
  "name": "m2",
  "chef_environment": "_default",
  "json_class": "Chef::Node",
  "automatic": {
  "kernel": {
  "name": "Linux",
...

The JSON data starts off with details about the node attributes, similar to what you get with ‘knife node show m2 -l -Fjson'.

  "success": true,
  "start_time": "2014-08-31 20:43:53 +0000",
  "end_time": "2014-08-31 20:44:28 +0000",
  "elapsed_time": 34.995100522,

It then lists some basic details about the chef-client run.

  "all_resources": [
    {
      "json_class": "Chef::Resource::ChefHandler",
      ...
    }
  ],
  "updated_resources": [
    {
      "json_class": "Chef::Resource::ChefHandler",
      ...
    }
  ],

The next JSON elements describe all of the resources that were part of the chef-client run, and which were updated.

  "exception": null,
  "backtrace": null,
  "run_id": "53e02623-1bc9-4b33-a08d-eeb89936feca"

And then finally there is information about the exception that occurred (which is null in this case, a successful run).

Running error_report handler when an exception occurs

Let’s create an exception by adding an invalid resource to the recipe:

include_recipe "chef_handler"

execute "brokenthing"

chef_handler "Chef::Handler::ErrorReport" do
  source "chef/handler/error_report"
  action :enable
end

The chef-client run will fail with an error:

Errno::ENOENT: No such file or directory - brokenthing

But if you copied my recipe, the error report won’t have been updated! Why?

The problem is that the failing ‘execute’ resource happened before the error report handler was enabled. If you move the ‘execute’ resource after the chef_handler resource, the error report will be created. So one takeaway is that it is good practice to define handlers in a recipe that you put at the start of the runlist. But that’s not all that you want to do.

If the handler is the first resource encountered in the run, then it will report any errors happening when subsequent resources are executed. But sometimes exceptions happen before this, in what’s called the ‘compile’ phase (see About the chef-client Run). This is the phase where the chef-client constructs a list of all of the resources and actions that it is meant to perform, which it then executes in the ‘converge’ phase. For a much deeper explanation of this, see The chef resource run queue.

What we want to do is to enable the error report handler as early as possible in the compile phase, so it can catch errors occurring during this phase too. Here’s how to do that:

include_recipe "chef_handler"

execute "brokenthing"

chef_handler "Chef::Handler::ErrorReport" do
  source "chef/handler/error_report"
  action :nothing
end.run_action(:enable)

The end.run_action(:enable) tells Chef to do the “enable” action immediately on encountering the resource (i.e. during ‘compile’). The action :nothing tells Chef that it does not need to do anything during the ‘converge’ phase (as its already been enabled).

With this change, now you will find that the end of the error report has exception and backtrace information:

  "exception": "Errno::ENOENT: execute[brokenthing] (testapp::handlers line 11) had an error: Errno::ENOENT: No such file or directory - brokenthing",
  "backtrace": [
    "/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/mixlib-shellout-1.4.0/lib/mixlib/shellout/unix.rb:320:in `exec'",
    "/opt/chef/embedded/lib/ruby/gems/1.9.1/gems/mixlib-shellout-1.4.0/lib/mixlib/shellout/unix.rb:320:in `block in fork_subprocess'",
...

Only running error_report on exception

Perhaps you do not want to run the error_report handler on every run, just when an exception occurs. To do this, we override the default supports attribute on the resource to specify it should be used with exceptions only:

chef_handler "Chef::Handler::ErrorReport" do
  source "chef/handler/error_report"
  action :nothing
  supports :exception=>true
end.run_action(:enable)

Replace :exception with :report if you only want the handler to run when the run is successful.

Running the json_file handler

The json_file handler is like the error_report handler, but puts the results into a timestamped file. The following chef_handler resource will result in data being written to /var/chef/reports/chef-run-report-20140831204047.json, for a run that started at 20:40:47 on 31st August 2014.

chef_handler 'Chef::Handler::JsonFile' do
  source 'chef/handler/json_file';
  arguments :path => '/var/chef/reports'
  action :nothing
end.run_action(:enable)

This example illustrates how to pass parameters into a handler.

You can also choose to use the ‘json_file’ recipe in the chef_handler cookbook to achieve the same result.

An alternative – using the client.rb file

An alternative (as described in the Chef docs) to enabling the handler using the chef_handler resource is to add the following to the client.rb file (or solo.rb file):

require 'chef/handler/json_file'
report_handlers << Chef::Handler::JsonFile.new(:path => "/var/chef/reports")
exception_handlers << Chef::Handler::JsonFile.new(:path => "/var/chef/reports")

Using a custom handler

The next step is to use a custom handler, i.e. one that is not built into the chef-client. The extra dimension here is that you need to get the handler code onto the node before it is run. You can do this by putting your handler source in the ‘files’ directory of your cookbook and using a cookbook_file resource to transfer it to the node, e.g.:

cookbook_file "#{node["chef_handler"]["handler_path"]}/your_handler.rb" do
  source "your_handler.rb"
  owner "root"
  group "root"
  mode 00755
  action :nothing
end.run_action(:create)

The expression #{node["chef_handler"]["handler_path"]} gives you the directory in which the chef_handler resource expects to find your handler. As previously, we run the cookbook_file resource immediately to ensure the handler file is created during the compile phase, before the handler itself is run. If the handler doesn’t run until the converge phase, you can replace the last two lines with:

  action :create
end

The chef_handler::default recipe can also be used to transfer handlers to the target node. You will need to make a copy of the chef_handlers cookbook and place your handlers in the ‘files/default/handlers’ directory of that cookbook (or copy the code from the default recipe into your handler cookbook).

To enable this handler, you would then define a chef_handler resource that refers to the transferred handler file:

chef_handler 'YourHandlerModule::YourHandler' do
  source "#{node["chef_handler"]["handler_path"]}/your_handler.rb";
  action :nothing
end.run_action(:enable)

For a real example you can try using, see Julian Dunn’s cookbook version handler.

Writing your own handler

To write a handler, you need to create a Ruby class that inherits from Chef::Handler and has a report method:

module YourHandlerModule
  class YourHandler < Chef::Handler
    def report
       # Handler code goes here
    end
  end
end

Put this code in the ‘your_handler.rb’ file in the ‘files’ directory of your handler cookbook.

Within the handler you can write arbitrary Ruby code, and you can use ‘run_status’ information available from the chef-client run. ‘run_status’ is basically the same information that is output by the ‘error_report’ handler. The information can be accessed through the following methods in the Ruby code:

  • data – a hash containing the run status
  • start_time, end_time, elapsed_time – times for the chef-client run
  • updated_resources, all_resources – the resources in the chef-client run
  • success?, failed? – methods that indicate if the chef-client run succeeded or failed

In the handler, you can access these directly or via the run_status object, e.g. ‘run_status.success?’ is equivalent to ‘success?’.

So for example, I can write the following handler:

require 'chef/log'

module TestApp
  module Handlers
     class UpdatedResources < Chef::Handler
      
      def report
        
        if success?
          Chef::Log.info('Running Updated Resources handler after successful chef-client run')
        else
          Chef::Log.info('Running Updated Resources handler after failed chef-client run')
        end

        Chef::Log.info('Updated resources are:')
        updated_resources.each do |resource|
          Chef::Log.info(resource.to_s)
        end
      end     
    end
  end
end

In the above, I use the success? method to test whether the run succeeded or not, and the updated_resources to loop through each resource updated during the run, and print it out as a string.

If you run ‘chef-client -linfo’ on the target node, you will see output similar to:

[2014-09-01T14:48:02+00:00] INFO: Running Updated Resources handler after successful chef-client run
[2014-09-01T14:48:02+00:00] INFO: Updated resources are:
[2014-09-01T14:48:02+00:00] INFO: chef_handler[Chef::Handler::JsonFile]
[2014-09-01T14:48:02+00:00] INFO: chef_handler[Chef::Handler::ErrorReport]
[2014-09-01T14:48:02+00:00] INFO: chef_handler[TestApp::Handlers::UpdatedResources]
  - TestApp::Handlers::UpdatedResources
Running handlers complete

In this case, the only updated resources in the chef-client run are the three chef_handlers in my handlers recipe. ‘chef_handler[TestApp::Handlers::UpdatedResources]’ is how the UpdatedResources handler is represented as a string.

You can also use the ‘to_hash’ method in place to ‘to_s’ in the above, to see what information is available about the resource. If you did this, you would see something like the following:

2014-09-01T14:54:25+00:00] INFO: {:name=>"TestApp::Handlers::UpdatedResources", :noop=>nil, 
:before=>nil, :params=>{}, :provider=>nil, :allowed_actions=>[:nothing, :enable, :disable], 
:action=>[:nothing], :updated=>true, :updated_by_last_action=>false, 
:supports=>{:report=>true, :exception=>true}, :ignore_failure=>false, :retries=>0, :retry_delay=>2, 
:source_line=>"/var/chef/cache/cookbooks/testapp/recipes/handlers.rb:41:in `from_file'", 
:guard_interpreter=>:default, :elapsed_time=>0.000480376, :resource_name=>:chef_handler, 
:cookbook_name=>"testapp", :recipe_name=>"handlers", 
:source=>"/var/chef/handlers/testapp_handlers.rb", :arguments=>[], 
:class_name=>"TestApp::Handlers::UpdatedResources"}

From this, you might decide just to print out the name of the resource, e.g.:

        updated_resources.each do |resource|
          Chef::Log.info("Updated resource with name: " + resource.to_hash[:name])
        end

Don’t expect ‘to_hash’ to work in all cases – it depends on whether it has been implemented in the relevant Ruby class. And be aware it may not be the best way to access the data. For example, the above can better be achieved using the name method on the resource, i.e. ‘resource.name’ in place of ‘resource.to_hash[:name]’. Over time, you’ll want to get comfortable with using debuggers to look at the data and reading the chef source code to understand how best to access it.

As well as ‘run_status’, you also have access to the run_context. I’ve included links to a couple of examples using the run context below.

Chef Handler with a parameter

If you want to pass a parameter into your handler, you need to add an initialize method to the handler, as below. This allows you to set the arguments attribute in chef_handler, which is passed to the initialize method as a parameter called ‘config’.

    class UpdatedResourcesToFile < Chef::Handler

      def initialize(config={})
        @config = config
        @config[:path] ||= "/var/chef/reports"
        @config
      end
            
      def report
        File.open(File.join(@config[:path], "lastrun-updated-resources.json"), "w") do |file|
          updated_resources.each do |resource|
            file.puts(resource.to_s)
          end
        end       
      end     
    end

What the above initialize method does is store the parameters in an attribute of the handler class called ‘@config’, and also sets a default value for the ‘path’ parameter. The ‘@’ indicates an attribute and means that the value is available to other methods in the class. The ‘report’ method can then access the path parameter using ‘@config[:path]’.

To enable the above handler, put something like the following in your handler recipe. You will also need to add a resource to create the path, if it doesn’t already exist.

chef_handler "TestApp::Handlers::UpdatedResourcesToFile" do
  source "#{node["chef_handler"]["handler_path"]}/testapp_handlers.rb"
  arguments :path => '/tmp'
  action :nothing
  supports :report=>true, :exception=>true
end.run_action(:enable)

In the above, I am overriding the default path value so that the file is written out to ‘/tmp’ instead.

Examples of handlers

Here are some examples of handlers that you may find useful:

Managing software versions in a multi-node topology with knife-topo

Here’s a simple example of using knife-topo with Chef to manage the versions of software deployed in a multi-node topology. This post assumes you have Vagrant and ChefDK installed.

The example topology that we’ll work with consists of three nodes as shown below:

topology

However, in this post, we’ll only define and deploy two of the nodes using knife-topo, Chef and Vagrant. I’ll introduce the third node in a later blog to illustrate some more complex uses of the topology JSON (such as conditional attributes). You can also look at the full example in the test-repo in knife-topo’s github repository.

Describing the topology

Define the nodes

The first step is to describe the topology that we want. Below is a minimal topology JSON file describing the two nodes in the topology (‘test1’). The ‘name’ property for the nodes defines the node name that will be used in Chef. The ‘ssh_host’ property specifies the address to use in bootstrapping the nodes.

{
  "name": "test1",
  "nodes": [
    {
      "name": "appserver01",
      "ssh_host": "10.0.1.3"
    },
    {
      "name": "dbserver01",
      "ssh_host": "10.0.1.2"
    }
  ]
}

This is sufficient to allow knife-topo to bootstrap the appserver and dbserver nodes, so that they will be managed by Chef. You can try this out by following the instructions later to download a test repo, run Vagrant and chef-zero, however, the results may be rather underwhelming. A data bag describing the topology will be created, Chef will be bootstrapped onto the two nodes, and they will register themselves with the server, but that’s all. You will also get warnings during bootstrap that the nodes have no runlists.

Define the runlists

Let’s fix the missing runlists now:

{
  "name": "test1",
  "nodes": [
    {
      "name": "appserver01",
      "ssh_host": "10.0.1.3",
      "run_list": [
        "recipe[apt]",
        "recipe[testapp::appserver]",
        "recipe[testapp::deploy]"
      ]
    },
    {
      "name": "dbserver01",
      "ssh_host": "10.0.1.2",
      "run_list": [
        "recipe[apt]",
        "recipe[testapp::db]"
      ]
    }
  ]
}

The recipes in these runlistsinstall the software (mongodb, nodejs, and a test application) on the nodes, and are provided in the test repo, along with a Berksfile to manage the dependencies. However, they do not specify what versions of the software should be installed. If you try running the example now, the latest versions will be chosen. But we want to get control of what’s on our topology. We can do this by defining the software versions as attributes in an environment cookbook specific to our topology.

Defining specific software versions as attributes in an environment cookbook

Here’s the first addition that’s needed to deploy specific software versions. We add a “cookbook_attributes” section, specify the name of the environment cookbook (‘testsys_test1’) and the attribute file we want (‘softwareversion’), and the necessary version attributes to install NodeJS version 0.10.28, and MongoDB version 2.6.1.

{
  "name": "test1",
  "nodes": [
    ...
  ],
  "cookbook_attributes": [
    {
      "cookbook": "testsys_test1",
      "filename": "softwareversion",
      "normal": {
        "nodejs": {
          "version": "0.10.28",
          "checksum_linux_x64": "5f41f4a90861bddaea92addc5dfba5357de40962031c2281b1683277a0f75932"
        },
        "mongodb": {
          "package_version": "2.6.1"
        }
      }
    }
  ]
}

The second change is to add the environment cookbook to the runlists for the two nodes:

{
  "name": "test1",
  "nodes": [
    {
      "name": "appserver01",
      "ssh_host": "10.0.1.3",
      "run_list": [
        "recipe[apt]",
        "recipe[testapp::appserver]",
        "recipe[testapp::deploy]",
        "testsys_test1"
      ]
    },
    {
      "name": "dbserver01",
      "ssh_host": "10.0.1.2",
      "run_list": [
        "recipe[apt]",
        "recipe[testapp::db]",
        "testsys_test1"
      ]
    }
  ],
  "cookbook_attributes": [
    ...
  ]
}

The topology JSON is now ready to use – go ahead and try it.

A word of warning if you ran knife-topo with runlists but no specific software versions – there’s currently a bug in the mongodb cookbook that means it can’t handle downgrades. So you may want to destroy the virtual machines (‘vagrant destroy’) and start again (or see troubleshooting for an alternative).

Running the example

These instructions may be enough to get you started. If you want more details or encounter problems, see these instructions and the knife-topo readme.

Setting up the environment

Install Vagrant and ChefDK if you do not already have them. Neither are required for knife-topo, but this post uses them.

Install knife-topo using gem. You may need to use ‘sudo’.

gem install knife-topo

To get the test repo, it’s easiest to download and unzip the latest knife-topo release from github (replace ‘0.0.7’ with the latest release number):

unzip knife-topo-0.0.7.zip -d

The test-repo gives you a multi-node Vagrantfile similar to what that I described in my previous post. Use that to create the virtual machines:

cd ~/knife-topo-0.0.7/test-repo
vagrant up

When the virtual machines have started (this can take a while the first time), you can start chef-zero. If you have chefDK, you can use the embedded chef-zero, or you can install chef-zero as a gem. Here’s how to start the embedded chef-zero on Ubuntu:

/opt/chefdk/embedded/bin/chef-zero -H 10.0.1.1

Importing the pre-requisites

Leave chef-zero running, open a new terminal and go to the test-repo, then upload the required cookbooks using berkshelf:


cd ~/knife-topo-0.0.7/test-repo
berks install
berks upload

Save your topology file as “mytopo.json” in the test-repo, then import your topology into your Chef workspace:

knife topo import mytopo.json

Do not name the file “test1.json” or it will cause an error later on when we load the test1 topology from the data bag file (knife looks for data bag files in the current directory BEFORE it looks in the data bag directory).

If you are using the final topology JSON, ‘knife-topo import’ will have generated some useful artifacts for you: in particular, an attribute file in the topology cookbook containing the specified software versions. This file is in “knife-topo-0.0.7/test-repo/cookbooks/testsys_test1/attributes/softwareversion.rb” and should look like:

#
# THIS FILE IS GENERATED BY THE KNIFE TOPO PLUGIN - MANUAL CHANGES WILL BE OVERWRITTEN 
#
# Cookbook Name:: testsys_test1
# Attribute File:: softwareversion.rb
#
# Copyright 2014, YOUR_COMPANY_NAME
#
normal['nodejs']['version'] = "0.10.28"
normal['nodejs']['checksum_linux_x64'] = "5f41f4a90861bddaea92addc5dfba5357de40962031c2281b1683277a0f75932"
normal['mongodb']['package_version'] = "2.6.1"

Bootstrapping the topology

knife topo create test1 --bootstrap -xvagrant -Pvagrant --sudo

This command uploads the topology cookbook, creates the test topology in the Chef server and bootstraps all nodes that provide an ‘ssh_host’. After it finishes, you should have a working two node topology with the specified software versions. The test application welcome screen is at: http://localhost:3031

Multi-node topologies using Vagrant and chef-zero

I’ve been bootstrapping a lot of multi-node topologies recently. Here’s some things I learned about making it easier with Vagrant and chef-zero. I also wrote a Chef knife plugin (knife-topo) – but more of that in a later post.

Multiple VM Vagrant files

The basics of a multi-VM vagrant file as described in the Vagrant documentation is that you have a “config.vm.define” statement for each machine, and setup the machine-specific configuration within its block. This looks something like:

Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu64"
  config.vm.box_url = "http://files.vagrantup.com/precise64.box"
  config.vm.synced_folder "/ypo", "/ypo"
  config.vm.synced_folder ".", "/vagrant", disabled: true

  # setup appserver
  config.vm.define "appserver" do |appserver_config|
    appserver_config.vm.hostname = "appserver"
    # other VM configuration for appserver goes here
  end

  # setup dbserver
  config.vm.define "dbserver" do |dbserver_config|
    dbserver_config.vm.hostname = "dbserver"
    # other VM configuration for dbserver goes here
  end
end

Lines 2-3 tell Vagrant what to put on the machines, and where to get it. Lines 4-5 are optional: line 4 sets up a directory on the host machine which will be accessible to each guest virtual machine; line 5 disables the default share.

Lines 7-10 and 13-15 each define a virtual machine (appserver and dbserver).  The two config variables (appserver_config and dbserver_config) are like the overall ‘config’ variable, but scoped to a specific machine.

This is fine when you don’t have much configuration information for each node, but it gets hard to read, laborious and error-prone to maintain when you start having lots of configuration statements or lots of nodes.

Separating out the configuration of the virtual machines

I got this approach from a post by Joshua Timberman in the days before I learned Ruby, and it was a great help in cleaning up my Vagrantfile.  Separating out the machine (node) configuration makes it so much easier to understand and update the Vagrant file.

First, we define a structure (Ruby hash) that describes how we want the nodes to be configured:

nodes = {
  :dbserver => {
    :hostname => "dbserver",
    :ipaddress => "10.0.1.2",
    :run_list => [ "role[base-ubuntu]", "recipe[ypo::db]" ]
  },
  :appserver => {
    :hostname => "appserver",
    :ipaddress => "10.0.1.3",
    :run_list => [ "role[base-ubuntu]", "recipe[ypo::appserver]"],
      :forwardport => {
        :guest => 3001,
        :host => 3031
    }
  }
}

The above defines two nodes, dbserver and appserver.  Lines 3-5 and 8-10 set up their hostnames, IP addresses and the run lists to use with Chef provisioning. Lines 11-13 set up port forwarding, so that what the application server listens for on port 3001 will be accessible through port 3031 on the host machine.

Second, we use the nodes hash to configure the VMs, with code something like this:

Vagrant.configure("2") do |config|
  nodes.each do |node, options|
    config.vm.define node do |node_config|
      node_config.vm.network :private_network, ip: options[:ipaddress]
      if options.has_key?(:forwardport)
        node_config.vm.network :forwarded_port, guest: options[:forwardport][:guest], host: options[:forwardport][:host]
      end
    end
    node_config.vm.hostname = options[:hostname]
    #
    # Chef provisioning options go here (see below)
    #
  end
end

Line 2 loops through the nodes hash, defining the configuration of each machine. Line 4 sets up the machine on a private network with a fixed IP address as specified in the nodes hash. Lines 5-7 setup port forwarding, if configured in the nodes hash. Line 9 sets up the hostname from the nodes hash.

Multiple nodes with a single chef-zero

chef-zero is an in-memory Chef server – a replacement for chef-solo. It’s a great way to test recipes, because it is just like a ‘real’ chef server. By default, it listens only locally, on http://127.0.0.1:8889. However, you can run it so that other nodes can access it – for example, all of the VMs in the Vagrantfile.  Doing this lets you use recipes requiring search of other nodes’ attributes, e.g. for dynamic configuration of connections between nodes. When you’re done testing, you just stop chef-zero – no cleanup of the server required.

You can install chef-zero as a gem:

gem install chef-zero

The machines defined above are all on the 10.0.1.x private network. So all we need to do is start chef-zero on that network, using:

chef-zero -H 10.0.1.1

and configure the VMs with a chef server URL of “10.0.1.1:8889”. Because the Vagrantfile is reading configuration from the knife.rb file, all we need to do is set variables in knife.rb – something like:

node_name                "workstation"
client_key               "#{current_dir}/dummy.pem"
validation_client_name   "validator"
validation_key           "#{current_dir}/dummy.pem"
chef_server_url          "http://10.0.1.1:8889"

The dummy.pem can be any validly formatted .pem file.

Using Chef provisioning with Vagrant

Running chef-client using Vagrant provisioning

Sometimes it is very useful to have Vagrant provision your machines using chef-client. The Vagrant documentation provides you with the basics for doing this, but not a lot else. The first useful tip (again from Joshua Timberman) is to use your knife.rb configuration file rather than hard-coding the server URL and authentication information. To do this, add the following at the top of the Vagrantfile:

require 'chef'
require 'chef/config'
require 'chef/knife'
current_dir = File.dirname(__FILE__)
Chef::Config.from_file(File.expand_path('../.chef/knife.rb'))

These five lines read in the configuration from your knife.rb file (change ‘../chef/knife.rb” to be the path to your knife.rb) and make it available to be used as below, which shows code to that should be inside the “nodes.each” block described previously:

node_config.vm.provision :chef_client do |chef|
  chef.chef_server_url = Chef::Config[:chef_server_url]
  chef.validation_key_path = Chef::Config[:validation_key]
  chef.validation_client_name = Chef::Config[:validation_client_name] 
  chef.node_name = options[:hostname]
  chef.run_list = options[:run_list]
  chef.provisioning_path = "/etc/chef"
  chef.log_level = :info
end

Lines 2-4 setup the connection to the Chef server. Lines 5-6 setup the node name and run list from the options hash we defined earlier. Setting the provisioning_path to “/etc/chef” puts the client.rb in the place that chef-client expects it, and setting log_level to “:info” provides a decent level of output (change to “:debug” if you have problems with the provisioning).

Cleaning up the Chef Server on vagrant destroy

When you destroy a Vagrant VM that you have provisioned using a Chef Server, you need to delete the node and client on the Chef server before you reprovision it in Vagrant. The following lines added into the node_config.vm.provision block above should do this for you:

chef.delete_node = true
chef.delete_client = true

However, there is an issue with this – it does not work for me, so I always have to use knife to manually clean up:

knife node delete appserver
knife client delete appserver

Controlling the version of chef-client

Initially I was running into issues because the chef client used by the Vagrant chef provisioner was back-level. This post on StackOverflow solved the problem for me. Install the chef-omnibus Vagrant plugin:

vagrant plugin install vagrant-omnibus

and specify the version of chef-client you want to use in the node configuration part of the Vagrant file:

node_config.omnibus.chef_version = "11.12.8"

or:

node_config.omnibus.chef_version = :latest

to install the latest version.